Estimating High-Dimensional Models for Database Marketing
Abstract
Marketers acquire information on customers from various sources, creating large databases. To use this information effectively, they need to estimate binary response models (e.g., Bult and Wansbeek 1995) in high dimensions, i.e., in the presence of numerous exogenous variables. This task entails serious challenges of computation, identification, and statistical inference. Because of computational complexity, binary response models are not easily scalable from a dozen to hundreds of variables. Furthermore, without proper identification of model parameters, the effects of discrete variables such as gender cannot be estimated. Finally, when making marketing decisions, database marketers may ignore important variables and/or retain irrelevant ones due to incorrect statistical inference. To mitigate these problems, we propose a two-stage approach. In the first stage, we convert the maximum-likelihood problem for estimating continuous variables into an eigenvalue problem, which is faster to solve. In the second stage, we identify and estimate the effects of discrete variables, and assess their significance. We illustrate this approach by analyzing a catalog marketer’s high-dimensional customer database. Our empirical example reveals that only a few variables impact customer response. Such sparsity is invaluable because it enables marketers to use this approach with massive databases that may contain thousands of variables. Thus, they can utilize the information in large-scale databases to target profitable customers, even online shoppers, in real time.

We live in ... a nearly empty world one in which there are millions of variables that in principle could affect each other but most of the time don’t.
Herbert Simon (1983, p. 20)

INTRODUCTION

Database marketers create customer databases by acquiring facts about their customers from various sources. For example, motor-vehicle and voter registrations provide information on age, name, address, and telephone; credit-reporting agencies and mortgage transactions furnish income estimates; county records reveal personal information such as home value; census surveys provide geo-demographic data; purchase transactions at online and offline stores describe buying habits; credit card companies reveal spending patterns; airline companies determine travel patterns; and specialized market research surveys discover lifestyle patterns. All this information helps marketers to construct high-dimensional databases that typically contain hundreds of variables (e.g., Bessen 1993; Blattberg, Glazer and Little 1994).

Having invested millions of dollars in information technology to capture a wealth of information on potential customers, database marketers want to use this information in planning and implementing marketing decisions (Bessen 1993, p. 156). However, the marketing analysis of large databases is a challenging task. When reviewing important issues in modeling large data sets in marketing, Balasubramanian, Gupta, Kamakura, and Wedel (1998, p. 320) noted that: “...the sheer size of available data places severe demands on computing power. While the capacity of computers in terms of speed and storage has increased... during the last two decades, the size of marketing databases ... increased even faster. ... This has revived interest in speeding up traditional numerical and statistical estimation methods...”